ABSTRACT


An experimental data base system, called SYLLOG, is
described. The system, which has been prototyped in the
language SETL, provides a screen-oriented English-like
language for use by non-programmers in setting up and using
a data base.

To  set  up  a  new  data  base,  some  standardized  English 
sentences  are  typed  in,  and  are  combined  into  syllogisms 
which  indicate  how  the  data  will  be  interpreted.  Then,  once 
the  data  have  been  loaded,  the  knowledge  in  the  syllogisms 
is  used  for  retrievals. 

The  knowledge  is  used  for  retrievals  by  a  backchaining 
algorithm  which  operates  on  the  syllogisms  alone.  A  tree 
resulting  from  the  backchaining  controls  an  iterative 
algorithm which searches the data base. It is shown that the
combined backchain-iteration algorithm is correct for
schemas  in  which  no  syllogism  calls  itself,  and  that  under 
this  restriction,  the  query  language  is  at  least  as  powerful 
as  the  relational  algebra.  An  extension  is  described  to 
handle  recursive  syllogisms,  such  as  those  which  yield  the 
transitive  closure  of  a  relation. 

1. INTRODUCTION


The  relational  model  for  data  bases  [4]  effectively 
frees  a  user  from  the  details  of  physical  access  to  data, 
and  it  provides  an  uncluttered  framework  in  which  such 
topics  as  normalization  and  the  meaning  of  updates  can  be 
discussed  [5,7].  Yet,  for  people  who  are  not  programmers  or 
mathematicians,  relational  data  bases  can  be  difficult  to 
use,  even  when  provided  with  a  high  level  query  language 
such  as  Query-by-Example  [12].  One  view  of  this  difficulty 
is  that,  while  the  user  knows  many  common  sense  rules  about 
the  real  world  situation  which  a  data  base  describes,  the 
data  base  system  does  not  (it  only  has  the  raw  data),  so 
there  is  plenty  of  room  for  misunderstandings. 

In work on computer systems which can represent the
knowledge of a human expert [9], the emphasis is on
capturing  everyday  rules  of  thumb  about  a  particular  subject 
(e.g.  medical  diagnosis),  and  then  using  the  rules  to  make 
deductions.  Very  often,  such  rules  are  expressed  in  a  form 

If  conjunction  of  premises  then  conclusion 

and  are  called  production   rules.   Such  rules  are  chained 
together  to  make  deductions. 

Clearly,  both  the  relational  model  of  data  and  the 
production  rule  model  share  some  features  with  the  first 
order  predicate  calculus  [3].  A  relational  data  base  can  be 
viewed  as  a  set  of  explicitly  listed  predicates  (a  model), 
and  a  set  of  production  rules  can  be  thought  of  as  rules  of 
inference  for  making  deductions.  However,  logic,  as  a 
formalism  for  everyday  computer  use,  is  beset  by  the  problem 
that  its  notation  is  difficult  for  non-specialists  to  learn 
and  use.  For  the  computer  scientist,  automatic  deduction  in 
first  order  logic  is  undecidable  in  general,  and  in 
decidable  subcases  can  consume  excessive  amounts  of  computer 
time in solving quite small problems. Since data bases can
be  quite  large,  there  is  a  difficulty  in  applying  automatic 
theorem  proving  directly  for  retrieval. 

Yet,  substantial  progress  is  being  made  in  bringing 
techniques  from  logic  into  the  realm  of  practical 
computation.  In  the  programming  language  PROLOG  [8],  a 
program  is  a  set  of  ordered  sequences  of  logical  clauses.  A 
clause  can  be  a  simple  ground  (variable-free)  assertion, 
which  can  be  regarded  as  a  row  in  a  relation  in  a  data  base, 
or  it  can  be  a  conjunction  of  predicates  containing 
variables  which  implies  a  conclusion  predicate,  in  which 
case  it  can  be  regarded  as  a  production  rule.  So,  for  small 
data  bases,  PROLOG  contains  the  means  to  store  data,  and  to 
make   deductions   about   the   data   using   production   rules. 

There are two drawbacks to using PROLOG directly for a
practical  data  base  system.  First,  PROLOG  notation,  though 
natural  for  computer  scientists,  is  probably  difficult  for 
most  non-specialists  to  use.  For  example,  an  otherwise 
correct  retrieval  program,  in  which  the  order  of  two  clauses 
is  reversed,  can  enter  an  infinite  loop.  Second,  the 
execution  mechanism  in  current  implementations  is  a  depth 
first  backtrack  search  over  internal  (or  virtual)  storage; 
the  problem  of  efficiently  searching  external  (e.g.  disk) 
storage  has  yet  to  be  addressed. 

In  section  2  of  this  paper,  we  describe  a 
screen-oriented,  English-like  language  for  setting  up  and 
using  a  data  base.  The  language  consists  of  syllogisms.  In 
section  3,  we  describe  a  backchaining  algorithm  which  forms 
the  first  stage  in  query  processing.  Backchaining  deals  only 
with  intensional  syllogisms,  not  with  the  extensional  data 
in  relations,  and  it  produces  a  tree  as  its  output.  In  the 
second stage of query processing, this tree is used to
control  an  iterative  search  of  the  extensional  data  base.  It 
is  shown  that  the  two-stage  backchain-iteration  algorithm, 
applied  to  non-recursive  syllogisms,  is  correct,  and  has  the 
power  of  the  relational  algebra.  In  section  4,  the 
backchain-iteration  algorithm  is  extended  to  deal  with 
recursive syllogisms, such as those which yield the
transitive  closure  of  a  relation.  Thus  the  SYLLOG  language 
becomes  strictly  more  powerful  than  the  relational  algebra. 

2. THE SYLLOG LANGUAGE

This  section  describes  the  SYLLOG  language,  from  the 
user's  point  of  view,  by  means  of  an  example  of  setting  up 
and  using  a  data  base. 

Suppose  we  are  interested  in  knowledge  and  data  about 
cities,  about  ways  of  travelling  from  one  city  to  another, 
and  about  ways  of  getting  around  inside  a  city.  Then  we  will 
want  to  know  about  statements  such  as  "Greenwich  Village  is 
in  New  York",  "uptown  is  in  New  York",  and  knowledge  such  as 
"if  two  places  are  in  the  same  city  then  one  can  take  a  taxi 
from  one  to  the  other". 


2.1 Data Definition; setting up a schema

In  SYLLOG,   one   says   that  a  new  data  base  will  be 
concerned with such facts by typing in

_village  is  in  _New-York 
_uptown  is  in  _New-York 


can  take  a  taxi  from  _village  to  _uptown 

and  we  call  this  a  syllogism.  The  underlines  in  front  of 
words  indicate  example  items.  Thus  the  sentence 

_village  is  in  _New-York 

can  be  read  as  "the  data  base  will  be  concerned,  amongst 
other  things,  with  something  being  in  something  else,  such 
as  the  village  being  in  New-York".  The  whole  syllogism  can 
be  understood  as  "if  a  place  is  in  a  given  city,  and  a 
second  place  is  in  the  same  city,  then  one  can  take  a  taxi 
from  the  first  place  to  the  second  place". 

At  this  stage,  the  system  contains  no  data,  just  the 
statement  that  it  will  contain  two  relations  "...is  in..." 
and  "can  take  a  taxi  from  ...  to  ...",  and  some  knowledge 
about  the  second  relation  given  some  data  in  the  first. 

2.2 Adding Data

One  could  now  type  in  some  data  like  this 

_village is in _New-York

uptown         New-York
village        New-York
white-house    Washington
patent-office  Washington

However,  it  is  not  necessary  to  type  the  first  sentence. The 
standard  SYLLOG  prompt  to  the  user  is  of  the  form 

Make a command using these and other sentences:

_village is in _New-York

can  take  a  taxi  from  _village  to  _uptown 

Thus,  to  make  the  above  command  to  add  some  data,  one  first 
deletes the sentence "can take a taxi..." from the screen,
then  types  in  an  underline  followed  by  the  data.  If  the  data 
are  in  a  file,  one  can  give  the  command 

_village  is  in  _New-York 

<file  name> 

which  adds  the  contents  of  the  file  to  the  "is  in"  relation. 


2.3  Querying  the  Data  Base 

At  this  point,  the  system  contains  some  data  and  some 
elementary  knowledge  about  how  to  use  the  data.  Suppose  we 
want  a  list  of  places  in  Washington.  The  SYLLOG  prompt 
places  the  prototype  sentences 

_village  is  in  _New-York 

can  take  a  taxi  from  _village  to  _uptown 

on  the  screen.  We  then  delete  the  second  sentence  and  insert 
an  underline  to  get 

_village is in _New-York

This is a command to print out all places in all cities, so
before  executing  it  we  change  _New-York  so  that  the  screen 
reads 

_village  is  in  Washington 

Note  that  _village  is  only  a  place  holder  here;  the  query 
would  have  the  same  effect  if  we  used  _white-house  or  _x 
instead.  However,  if  we  changed  Washington  to  New-York,  we 
would  have  a  different  query. 

We  now  indicate  that  the  query  is  to  be  executed.  Data 
appear  on  the  screen  below  the  command  like  this. 

_village  is  in  Washington 

patent-office  Washington 
white-house     Washington 

and we have answered the query "which places are in
Washington".

Note  that  we  could  also  have  made  the  query 

village  is  in  Washington 


i.e. "is the village in Washington?". The resulting screen
is

village is in Washington

EMPTY ANSWER

while if we asked

white-house  is  in  Washington 


the resulting screen is the confirmation

white-house is in Washington

white-house    Washington

Now suppose we want a list of places and the cities
which  they  are  not  in.  As  before,  SYLLOG  prompts  with  the 
standard  sentences 

_village  is  in  _New-York 

can  take  a  taxi  from  _village  to  _uptown 

We  replace  the  second  sentence  on  the  screen  by  an 
underline,  and  change  the  first  sentence  by  inserting  "not", 
yielding  the  query 

_village  is  not  in  _New-York 

When  the  query  has  been  executed,  the  screen  shows 

_village is not in _New-York

patent-office  New-York
uptown         Washington
village        Washington
white-house    New-York

So  far,  we  have  just  queried  the  relation  "is  in"  into 
which we loaded some data. Now suppose we are interested in
getting  from  place  to  place  by  taxi.  As  usual  SYLLOG  prompts 
with  the  sentences 

_village  is  in  _New-York 

can  take  a  taxi  from  _village  to  _uptown 

from  which  we  can  construct,  on  the  screen,  the  query 

can take a taxi from _village to _uptown

After  the  query  is  executed,  the  screen  shows 

can  take  a  taxi  from  _village  to  _uptown 

patent-office  patent-office
patent-office  white-house
uptown         uptown
uptown         village
village        uptown
village        village
white-house    patent-office
white-house    white-house

The  answer  is  correct,  at  least  in  that  it  reflects  the  data 
and  the  syllogism 

_village is in _New-York
_uptown is in _New-York


can  take  a  taxi  from  _village  to  _uptown 

from  which  it  was  computed.  However,  the  answer  is  lacking 
in  real  world  knowledge;  people  don't  take  taxis  from  a 
place  to  the  same  place.  This  fact  can  be  included  by 
changing  the  syllogism  to 

_village  is  in  _New-York 
_uptown  is  in  _New-York 
_village not EQUAL _uptown


can  take  a  taxi  from  _village  to  _uptown 

where EQUAL is a built-in test in SYLLOG. (We describe how a
syllogism can be changed in the next section.) With the new
syllogism, the query remains the same, and the screen
containing  the  answer  is 

can take a taxi from _village to _uptown


patent-office  white-house 

uptown  village 

village  uptown 

white-house  patent-office 

Note  that,  for  real  situations,  further  refinement  of  the 
syllogisms  might  be  needed;  for  example,  a  place  might  have 
two names.
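
As an illustration only, the effect of the extra premise can be reproduced in a few lines of Python. The names `IS_IN` and `taxi` are invented for this sketch and are not part of SYLLOG; the data are the four facts loaded in section 2.2, and the nested loop is simply one way of reading the two "is in" premises.

```python
# Hypothetical sketch: the "is in" facts from section 2.2 as (place, city) pairs.
IS_IN = {
    ("uptown", "New-York"),
    ("village", "New-York"),
    ("white-house", "Washington"),
    ("patent-office", "Washington"),
}

def taxi(is_in, require_distinct):
    """Pairs (a, b) such that a and b are in the same city.

    With require_distinct=True the extra premise
    "_village not EQUAL _uptown" is applied as well.
    """
    return sorted(
        (a, b)
        for (a, city_a) in is_in
        for (b, city_b) in is_in
        if city_a == city_b and (a != b or not require_distinct)
    )

# Print the answer rows for the refined syllogism.
for pair in taxi(IS_IN, require_distinct=True):
    print(*pair)
```

Calling `taxi(IS_IN, require_distinct=False)` gives the eight rows of the first answer screen, including the unwanted rows in which both places are the same.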

2.4 Querying, Adding, Deleting and Changing Syllogisms

In  the  last  section,  we  modified  a  syllogism  about 
taking  a  taxi  by  placing  an  extra  condition  in  its  premise. 
SYLLOG  allows  syllogisms  to  be  queried  and  modified. 

To  query  the  knowledge  base  of  syllogisms,  one  starts, 
as usual, with the prompt, which consists of the sentences
known  to  the  system.  In  this  case  we  are  interested  in  a 
rule,  or  rules,  about  taking  taxis,  so  we  just  leave  the 
sentence 

can  take  a  taxi  from  _village  to  _uptown 

on the screen. This is understood as a command to list all
of the syllogisms having this sentence (or one like it but
for renaming of _village and _uptown) as a conclusion.
Thus the rule

_village is in _New-York
_uptown is in _New-York


can take a taxi from _village to _uptown

appears  on  the  screen.  The  syllogism  is  now  edited,  on  the 
screen,  to  its  new  form 

_village is in _New-York
_uptown  is  in  _New-York 
_village  not  EQUAL  _uptown 


can  take  a  taxi  from  _village  to  _uptown 

and  replaces  the  old  syllogism. 

An  entirely  new  syllogism  can  simply  be  typed  in,  while 
a syllogism can be deleted by calling it up on the screen
with  a  query  command,  and  then  erasing  it  from  the  screen. 

Although  it  is  easy  for  the  user  to  modify  the 
syllogisms,  this  should  be  done  with  some  thought,  since 
some  data  may  be  erased  in  the  process.  SYLLOG  marks  each 
fact  in  a  relation  according  to  whether  it  has  been  asserted 
in  an  add  or  change  command,  or  has  been  deduced  via  the 
syllogisms  during  a  query.  When  a  syllogism  is  added, 
changed, or deleted, all of the affected deduced data are
erased.  If  a  syllogism  which  contains  the  last  mention  of  a 
particular  sentence  is  deleted,  then  that  sentence  is 
dropped  from  the  prompt  list,  unless  there  are  facts  which 
have  been  asserted  about  it. 



2.5  Further  Querying  of  the  Data  Base 

Suppose  we  now  add  some  data  and  a  syllogism  about 
travelling  by  train.  We  add  the  data 

can  go  by  train  from  _village  to  _Newark 

village  Hoboken 

Hoboken  Newark 

Newark  Washington 

and  the  syllogism 

can  go  by  train  from  _village  to  _Hoboken 
can  go  by  train  from  _Hoboken  to  _Newark 

can  go  by  train  from  _village  to  _Newark 

This  syllogism  is  special,  in  that  the  sentence  in  the 
conclusion  also  appears  in  the  premise.  We  say  that  the 
syllogism  is  recursive. 

The   SYLLOG   prompt   is   now  an   invitation   to  make   a 
command  using  the  sentences 

_village  is  in  _New-York 

can  take  a  taxi  from  _village  to  _uptown 

can  go  by  train  from  _village  to  _Newark 

To  ask  which  places  we  can  get  to  by  train  from  Hoboken,  we 
form  the  query 

can  go  by  train  from  Hoboken  to  _Newark 


When the query has been made, the screen shows

can go by train from Hoboken to _Newark


Hoboken  Newark 

Hoboken  Washington 

Note  that  Hoboken-Washington  is  not  in  the  data  we  asserted. 
It  has  been  deduced  by  using  the  syllogism  to  bridge  Newark 
in Hoboken-Newark-Washington.

If  we  now  form  the  query 

can  go  by  train  from  Washington  to  _village 

we  get  EMPTY  ANSWER.  Strictly,  this  is  correct,  since  the 
system  only  knows  about  trains  in  one  direction.  However,  it 
is  not  what  is  really  wanted,  so  we  add  the  syllogism 

can go by train from _village to _Newark

can  go  by  train  from  _Newark  to  _village 

which  says  that  any  time  we  can  go  from  A  to  B  by  train,  we 
can  also  go  from  B  to  A.  If  we  now  repeat  the  question  about 
which  places  we  can  go  to  by  train  from  Washington,  we  get 
the  answer 

can  go  by  train  from  Washington  to  _Hoboken 

Washington  Hoboken 

Washington  Newark 

Washington  village 

Washington  Washington 

which,  apart  from  the  last  row,  is  reasonable.  The  last  row 
could  be  suppressed,  as  in  the  taxi  example  in  section  2.3, 
by  modifying  the  syllogisms. 
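
The answers in this section can be checked with a naive fixpoint computation. The sketch below is an assumption-laden illustration, not the algorithm SYLLOG uses for recursive syllogisms: `deduce` and `reachable` are invented names, and the loop simply reapplies the chaining syllogism (and, optionally, the reversing syllogism) until no new fact appears.

```python
# Asserted "can go by train" facts from this section.
TRAIN = {("village", "Hoboken"), ("Hoboken", "Newark"), ("Newark", "Washington")}

def deduce(asserted, symmetric=False):
    """Apply the train syllogisms repeatedly until nothing new is deduced."""
    facts = set(asserted)
    while True:
        # Chaining syllogism: from A to B and from B to C gives from A to C.
        new = {(a, c) for (a, b) in facts for (b2, c) in facts if b == b2}
        if symmetric:
            # Reversing syllogism: from A to B gives from B to A.
            new |= {(b, a) for (a, b) in facts}
        if new <= facts:
            return facts
        facts |= new

def reachable(facts, start):
    """Destinations listed for a query with a fixed starting place."""
    return sorted(b for (a, b) in facts if a == start)

print(reachable(deduce(TRAIN), "Hoboken"))
print(reachable(deduce(TRAIN, symmetric=True), "Washington"))
```

With only the chaining syllogism, nothing is reachable from Washington, which corresponds to EMPTY ANSWER; adding the reversing syllogism produces all four places, including Washington itself.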


2.6  Deleting  and  Changing  Data 

Data  can  be  deleted  from  a  data  base  in  SYLLOG  by 
simply  bringing  it  to  the  screen  using  a  query,  and  then 
erasing  it  from  the  screen.  This  works  directly  for  asserted 
data.  However,  data  which  have  been  deduced  using  the 
syllogisms  cannot  be  deleted  in  this  way,  and  a  warning 
message  results. 

Similarly,  asserted  data  may  be  changed  by  first  using 
a  query  to  bring  it  to  the  screen.  Attempts  to  change 
deduced  data  yield  a  warning  message. 
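
The distinction between asserted and deduced data, described in sections 2.4 and 2.6, could be kept with one mark per row. The following is a minimal sketch under that assumption; the class `MarkedRelation`, its method names, and the wording of the warning are all hypothetical and are not taken from the SYLLOG prototype.

```python
class MarkedRelation:
    """A relation whose rows are marked "asserted" or "deduced"."""

    def __init__(self):
        self.mark = {}  # row (a tuple) -> "asserted" or "deduced"

    def assert_row(self, row):
        # Rows typed in by the user, or loaded from a file.
        self.mark[row] = "asserted"

    def deduce_row(self, row):
        # Rows produced by syllogisms; never overrides an asserted row.
        self.mark.setdefault(row, "deduced")

    def delete_row(self, row):
        # Only asserted rows may be erased from the screen.
        if self.mark.get(row) == "deduced":
            return "WARNING: deduced data cannot be deleted"
        self.mark.pop(row, None)
        return "OK"

    def erase_deduced(self):
        # Called when a syllogism is added, changed, or deleted.
        self.mark = {r: m for r, m in self.mark.items() if m == "asserted"}

train = MarkedRelation()
train.assert_row(("Hoboken", "Newark"))
train.deduce_row(("Hoboken", "Washington"))
print(train.delete_row(("Hoboken", "Washington")))
```

For simplicity this sketch erases every deduced row of the relation, whereas the text only requires the affected deduced data to be erased.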

3. QUERY EVALUATION BY BACKCHAIN-ITERATION

In  the  last  section  we  described  SYLLOG  from  the  point 
of  view  of  the  person  who  uses  the  system.  This  section 
describes  how  a  query  is  processed  by  SYLLOG,  in  the  case 
that  the  query  syllogisms  are  not  recursive.  A  proof  of 
correctness  of  the  query  algorithm  is  given,  and  it  is  shown 
that  non-recursive  collections  of  syllogisms  have  at  least 
the  power  of  the  relational  algebra.  Section  4  treats  the 
case in which recursion is present.


3.1  An  Example 

Syllogisms  are  stored  internally  in  SYLLOG  in  the  form 
of  production  rules.  For  example  the  syllogism 

_village is in _New-York
_uptown  is  in  _New-York 


can  take  a  taxi  from  _village  to  _uptown 

is  stored  in  a  form  corresponding  to 

C2(x,y) <- I1(x,z) I1(y,z)

where C2 and I1 are system-generated relation names. We
call this form a rule, and we write a rule with the
conclusion on the left for convenience in discussing
backchaining.

As  mentioned  in  section  2,  the  system  stores  a  current 
list  of  prompting  sentences,  which  contains  a  representative 
sentence  for  each  sentence  which  has  been  used  in  a 
syllogism  or  a  command.  If  two  sentences  differ  only  by 
renaming  of  variables,  or  by  instantiation  of  variables,  or 
by  the  presence  of  "not",  only  one  representative  is  stored 
in  the  prompt  list.  The  list  is  indexed  by  system-generated 
relation names, such as C2 and I1 above. Thus a sentence
is  translated  into  its  relation  form  by  a  simple  pattern 
match  followed  by  a  table  lookup,  and  a  relation  is 
translated  into  a  sentence  by  a  table  lookup  followed  by  the 
substitution  of  the  appropriate  variables  or  constants  into 
the  sentence.  Translations  between  the  rule  and  syllogism 
forms  are  then  simply  made  sentence  by  sentence  or  relation 
by relation. While the English-like properties of the SYLLOG
language  should  be  easy  for  people  to  use,  it  is  plain  that 
very  little  computer  time  or  space  is  needed  to  translate 
between  syllogisms  and  rules. 
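
The pattern match and table lookup can be sketched as follows. This is only an illustration under stated assumptions: the table `PROMPTS` and the functions `to_relation` and `to_sentence` are invented names, a word beginning with an underline in a representative sentence is treated as an argument position, and every occurrence of the word "not" is taken to be a negation.

```python
# Hypothetical prompt list, indexed by system-generated relation names.
PROMPTS = {
    "I1": "_village is in _New-York",
    "C2": "can take a taxi from _village to _uptown",
}

def to_relation(sentence):
    """Translate a sentence into (relation name, arguments, negated)."""
    words = sentence.split()
    negated = "not" in words
    words = [w for w in words if w != "not"]
    for name, template in PROMPTS.items():
        slots = template.split()
        # Fixed words must agree; underlined words match any item.
        if len(slots) == len(words) and all(
            s.startswith("_") or s == w for s, w in zip(slots, words)
        ):
            args = tuple(w for s, w in zip(slots, words) if s.startswith("_"))
            return name, args, negated
    return None

def to_sentence(name, args):
    """Translate a relation back into a sentence by substitution."""
    remaining = iter(args)
    return " ".join(
        next(remaining) if s.startswith("_") else s
        for s in PROMPTS[name].split()
    )

print(to_relation("white-house is in Washington"))
print(to_sentence("C2", ("_x", "_y")))
```

Both directions make a single pass over the words of a sentence, which is consistent with the remark that the translation needs very little time or space.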

Suppose  that  only  the  two  syllogisms 

_village  is  in  _New-York 
_uptown is in _New-York

can  take  a  taxi  from  _village  to  _uptown 

and 

can take a taxi from _uptown to _village
can  go  by  train  from  _village  to  _Hoboken 

can go from _uptown to _Hoboken

are present, and that they are represented by the rules

C2(x,y) <- I1(x,z) I1(y,z)

C4(x,z) <- C2(x,y) C3(y,z).

A query

can go from uptown to Hoboken

is translated into C4(uptown,Hoboken), and causes the
following tree to be constructed from the rules:


                C4(uptown,Hoboken)
               /                  \
      C2(uptown,y)            C3(y,Hoboken)
       /        \
I1(uptown,z)   I1(y,z)

The tree is constructed using only the query and the rules,
without  reference  to  the  facts  in  the  data  base.  Next,  the 
tree  is  interpreted  as  a  query  program,  and  executed  as 
follows.  Each  node  of  the  tree  is  assigned  an  initially 
empty  set,  called  its  extension.  Then,  each  leaf  node 
extension  is  made  equal  to  the  set  of  tuples,  from  the 
corresponding  asserted  data,  relevant  to  the  predicate  at 
the  node.  For  example,  the  leftmost  leaf  in  the  tree  gets 
the  rows  from  the  "is  in"  relation  which  start  with 
"uptown".  Next,  the  lowest  level  of  the  tree  is  executed,  in 
this  case  using  an  operation  equivalent  to  the  relational 
algebra   join  (we  write  *  for  join)  as 

I1(uptown,z) * I1(y,z)

and  the  result 

uptown  uptown 

uptown  village 


is placed in the extension of C2. Then, the upper level of
the  tree  is  executed,  placing 

uptown  Hoboken 

in the extension of C4. Now the extension of the root has
been  computed,  and  it  is  printed  as  the  answer. 

In  the  example,  the  tree  represents  a  conjunctive 
query.  In  general,  a  query  may  contain  disjuncts,  in  which 
two  or  more  rules  contribute  to  the  extension  of  a  node,  and 
negations,  in  which  case  a  node  is  extended  by  adding  tuples 
which  are  not  in  the  set  calculated  by  a  rule.  Thus  the 
backchain  procedure  yields  an  and-or-not  tree.  Note  that,  in 
the construction of the tree, selection arguments in the
query (e.g. uptown, Hoboken) are propagated downwards.
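
The execution of the example tree can be traced by hand with ordinary set operations. The sketch below assumes the "is in" facts of section 2.2 and the asserted train facts of section 2.5; the variable names are invented, and each comprehension stands for one node of the tree with the selection arguments already propagated down to it.

```python
IS_IN = {
    ("uptown", "New-York"),
    ("village", "New-York"),
    ("white-house", "Washington"),
    ("patent-office", "Washington"),
}
TRAIN = {("village", "Hoboken"), ("Hoboken", "Newark"), ("Newark", "Washington")}

# Leaf extensions: asserted rows relevant to the predicate at each leaf.
leaf_uptown = {(p, c) for (p, c) in IS_IN if p == "uptown"}   # "is in" rows starting with uptown
leaf_any = set(IS_IN)                                         # all "is in" rows
leaf_train = {(y, h) for (y, h) in TRAIN if h == "Hoboken"}   # train rows ending in Hoboken

# Lowest level: join the two "is in" leaves on the shared city.
taxi_from_uptown = {
    (x, y) for (x, z) in leaf_uptown for (y, z2) in leaf_any if z == z2
}

# Upper level: join the taxi node with the train leaf on the shared place.
root = {
    (x, h) for (x, y) in taxi_from_uptown for (y2, h) in leaf_train if y == y2
}

print(sorted(taxi_from_uptown))
print(sorted(root))
```

The intermediate set holds the two rows shown above for the taxi node, and the root holds the single answer row.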

3.2 Definitions

In  this  section,  we  set  down  the  definitions  which  are 
needed  to  prove  the  correctness  of  the  backchain-iteration 
algorithm. 

We  use  x,...,z  as  individual  variables,  a,...,d  as 
constants,  and  x,...,z  with  subscripts  to  denote  ordered 
lists  of  variables  and  constants.  A  substitution  is  a 
function  s,  from  variables  and  constants  to  variables  and 
constants,  such  that  s(a)=a  for  each  constant  a. 

A  knowledge  base  K  is  a  finite  set  of  clauses,  each 
of  the  form 

A(x0) <- B1(y1)..Bm(ym) -C1(z1)..-Cn(zn)

where m+n is greater than or equal to 0, A(x0) and Bi(yi),
i=1..m, are positive literals (e.g. P(x,y)), and
-Cj(zj), j=1..n, are negative literals (e.g. -P(x,y)).

If m+n > 0 the clause is a rule. If m+n = 0, then the
clause is an assertion. We assume that assertions contain
no variables, and, if

A(x0) <-

is an assertion in K, then K contains no rule with A(x0)
on the left.

Each rule is such that, if a variable appears in x0,
then it appears in some yi or zj. Also, if a variable
appears in some zj, then it also appears in some yi.

Note that given a clause in which some variable appears
in a zj but not in a yi, we can often replace the clause
by a set of clauses in which each variable in a zj does
appear in a yi. For example, we can replace

P(x,y)  <-  -Q(x,y) 

by  the  clauses 

P(x,y) <- Q1(x) Q2(y) -Q(x,y)

Q1(x) <- Q(x,y)

Q2(y) <- Q(x,y).
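
A small worked instance may make the replacement concrete. In the sketch below the relation `Q` and its two rows are made up for illustration; `Q1` and `Q2` collect the values appearing in the first and second columns of `Q`, so that the negation is only ever tested on pairs built from values already present.

```python
# Hypothetical two-row relation Q.
Q = {("a", "b"), ("b", "c")}

Q1 = {x for (x, _) in Q}   # Q1(x) <- Q(x,y)
Q2 = {y for (_, y) in Q}   # Q2(y) <- Q(x,y)

# P(x,y) <- Q1(x) Q2(y) -Q(x,y)
P = {(x, y) for x in Q1 for y in Q2 if (x, y) not in Q}

print(sorted(P))
```

Without the two positive literals, the rule would have no finite set of candidate pairs over which to evaluate the negative literal.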

Let s be a substitution. We say that A(s(x0))
follows from K, written A(s(x0)) -! K, if

(i) A(s(x0)) <- is in K, or

(ii) there is a rule

A(x0) <- B1(y1)..Bm(ym) -C1(z1)..-Cn(zn)

in K such that

a) Bi(s(yi)) -! K for i = 1..m, and

b) it is not the case that Cj(s(zj)) -! K
   for any j in {1,..,n}.

Where K is understood, we write -! instead of -! K.

A program for A(x0) is a tree with root A(x0)
defined by the following. If there exists a rule

A(x0') <- B1(y1)..Bm(ym) -C1(z1)..-Cn(zn)

and a substitution s such that s(x0') = x0, then (using
s to rename variables which are already in the tree) add

A(x0) <- B1(s(y1))..Bm(s(ym)) -C1(s(z1))..-Cn(s(zn))

to the tree below the root. If there is a program for
Bi(s(yi)) or Cj(s(zj)) then add that to the tree
also.

Note  that  the  program  tree  is  finite  only  if  no  rule 
eventually  calls  itself  as  a  program.  We  assume  this  to  be 
the  case  for  now. 

The extension of a program tree is defined as
follows:

(1) Each node is assigned an empty set, called its
extension.

(2) For each leaf C(y0), if C(s(y0)) <- is in K for some
substitution s, then place C(s(y0)) in the extension of
the leaf. Then mark the leaf extended.

(3) If

A(x0) <- B1(y1)..Bm(ym) -C1(z1)..-Cn(zn)

is a rule in the program tree, the nodes Bi(yi) and
Cj(zj) have been extended, Bi(s(yi)) is in the
extension of Bi(yi) and Cj(s(zj)) is not in the
extension of Cj(zj), then place A(s(x0)) in the
extension of A(x0). When all such placings have been made,
mark A(x0) extended.

(4) Repeat (3) until the root of the tree is marked
extended.
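
Steps (1) to (4) can be approximated by a short bottom-up evaluator. What follows is a sketch under several assumptions rather than the SYLLOG implementation: variables are written with a leading "?", the function names are invented, whole relations are computed without propagating selection arguments, and the relation names PLACE, CITY and NOTIN are hypothetical helpers used to express the "is not in" query of section 2.3 as a safe rule. The recursion terminates only when no rule eventually calls itself.

```python
def is_var(term):
    return term.startswith("?")

def match(args, row, binding):
    """Extend binding so that args matches row; return None on a clash."""
    new = dict(binding)
    for arg, value in zip(args, row):
        if is_var(arg):
            if new.setdefault(arg, value) != value:
                return None
        elif arg != value:
            return None
    return new

def extension(pred, rules, facts, cache=None):
    """Extension of pred, given non-recursive rules and asserted facts."""
    cache = {} if cache is None else cache
    if pred in cache:
        return cache[pred]
    result = set(facts.get(pred, ()))            # step (2): asserted rows
    for (head_pred, head_args), positive, negative in rules:
        if head_pred != pred:
            continue
        bindings = [{}]
        for p, args in positive:                 # step (3): positive literals
            ext = extension(p, rules, facts, cache)
            bindings = [b2 for b in bindings for row in ext
                        if (b2 := match(args, row, b)) is not None]
        for p, args in negative:                 # step (3): negative literals
            ext = extension(p, rules, facts, cache)
            bindings = [b for b in bindings
                        if tuple(b.get(a, a) for a in args) not in ext]
        result |= {tuple(b.get(a, a) for a in head_args) for b in bindings}
    cache[pred] = result                         # the node is now extended
    return result

FACTS = {
    "I1": {("uptown", "New-York"), ("village", "New-York"),
           ("white-house", "Washington"), ("patent-office", "Washington")},
    "C3": {("village", "Hoboken"), ("Hoboken", "Newark"), ("Newark", "Washington")},
}
RULES = [
    (("C2", ("?x", "?y")), [("I1", ("?x", "?z")), ("I1", ("?y", "?z"))], []),
    (("C4", ("?x", "?z")), [("C2", ("?x", "?y")), ("C3", ("?y", "?z"))], []),
    (("PLACE", ("?x",)), [("I1", ("?x", "?y"))], []),
    (("CITY", ("?y",)), [("I1", ("?x", "?y"))], []),
    (("NOTIN", ("?x", "?y")),
     [("PLACE", ("?x",)), ("CITY", ("?y",))], [("I1", ("?x", "?y"))]),
]

print(sorted(extension("C4", RULES, FACTS)))
print(sorted(extension("NOTIN", RULES, FACTS)))
```

The cache plays the role of the "extended" mark: a relation is computed once, after all of the relations it depends on have been computed.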

3.3 Correctness of Backchain-Iteration

The  definitions  in  the  last  section  allow  us  to  prove 
that  an  answer  tuple  is  in  the  output  from  a  query  if,  and 
only  if,  it  follows  from  the  knowledge  base.  Formally,  this 
is stated as:

Theorem 1  Let T be an extended program tree with
root A(x0), and let s be a substitution. Then A(s(x0))
is in the extension of the root of T iff A(s(x0)) -!.

Proof  Without loss of generality, assume that there
is just one clause at the root of the tree, and let it be

A(x0) <- B1(y1)..Bm(ym) -C1(z1)..-Cn(zn)

Case m+n=0: The extension of A(x0) is defined as

{ A(x0') ! there exists a substitution s such
  that s(x0)=x0' and A(x0') <- is in K }

so, from the definition of -!, A(x0') is in the extension
of A(x0) at the root of the tree iff A(x0') -!.

Case m+n>0: Suppose A(s(x0)) is in the extension of the
root of T, for some s. Then, by definition of the extension
of T, Bi(s(yi)) is in the extension of the node marked
Bi(yi), and Cj(s(zj)) is not in the extension of the
node marked Cj(zj). As an inductive hypothesis, assume
that the presence of Bi(s(yi)) in the extension of the
node marked Bi(yi) implies that Bi(s(yi)) -!, and
that the absence of Cj(s(zj)) from the extension of the
node marked Cj(zj) implies that it is not the case that
Cj(s(zj)) -!. Then, from the definition of -!,
A(s(x0)) -!.

A similar argument establishes that if A(s(x0)) -!
then A(s(x0)) is in the extension of the root of T. []

The  theorem  applies  to  the  case  in  which  no  rule  calls 
itself,  i.e.  in  which  the  set  of  rules  is  such  that  it  is 
not  possible  for  a  literal  to  be  repeated  along  a  path  in  a 
program  tree.  A  more  general  case  is  discussed  in  section  4. 

3.4 Power of Non-Recursive Backchain-Iteration is that of
the Relational Algebra

This  section  shows  that  if  a  query  can  be  written  in  the 
relational  algebra  [5],  then  it  can  also  be  written  in 
SYLLOG.  (We  shall  sometimes  refer  to  the  relational  algebra 
simply  as  the  algebra).  Since  algebra  expressions  are 
(tacitly)  formulated  to  be  non-recursive,  we  shall  see  that 
the  corresponding  sets  of  syllogisms  are  also  free  of 
recursion.  Thus  SYLLOG,  without  recursive  syllogisms,  has  at 
least  the  power  of  the  algebra. 

Following [9], we take the five operations union,
set difference, cartesian product, project, and select
to define the algebra. So SYLLOG is as powerful as the
algebra if it can be shown to simulate any inductive
combination of these five operations. However it would be
quite inconvenient in practice to use only SYLLOG
equivalents of these, so we also show how the relational
algebra operations natural join, intersection, and
quotient can be written in SYLLOG.


Theorem  2  If  a  query  can  be  written  in  the 
relational  algebra,  then  it  can  also  be  written  in  SYLLOG. 

Proof  We  shall  use  the  internal  rule  form,  since  the 
translation  between  this  and  the  syllogism  form  is 
straightforward.  For  each  of  the  algebra  operations,  we  show 
an  equivalent  rule,  or  set  of  rules. 

1. Union  In the algebra,

R = R1 U R2 = { xk ! R1(xk) V R2(xk) }

In rules

R(xk) <- R1(xk)

R(xk) <- R2(xk).

2. Set difference

R = R1 - R2 = { xk ! R1(xk) and not R2(xk) }

R(xk) <- R1(xk) -R2(xk)

3. Cartesian product

R = R1 X R2 = { <xk,xj> ! R1(xk) and R2(xj) }

R(xk,xj) <- R1(xk) R2(xj)

4. Project

R = PROJ[i1,..,im] R1 =

{ <x_i1,..,x_im> ! there exists <y1,..,yn> in R1
  such that x_ij = y_ij for j=1..m }

where x_ij, yk and y_ij are domain variables [10].

R(x_i1,..,x_im) <- R1(y1,..,yn).

5. Select

R = SEL[P] R1(xk) =

{ xk ! R1(xk) and P(xk) }

where P is a predicate defined in terms of:
(i) operands that are constants or variables,
(ii) lexical or arithmetic comparison operators
     <, =, >, < or =, not =, > or =,
(iii) logical operators and, or, not.


We  give  some  examples  of  translation  of  select 
expressions into SYLLOG. Using these, it is straightforward
to  inductively  decompose,  then  translate,  an  arbitrary 
select.  Let  @  and  %  be  comparison  operators. 


R(x,y) = SEL[x@y] R1(x,y) =

{ <x,y> ! R1(x,y) and x@y }

R(x,y) <- R1(x,y) @(x,y)


R(x,y) = SEL[x@y and x%y] R1(x,y) =

{ <x,y> ! R1(x,y) and x@y and x%y }

R(x,y) <- R1(x,y) @(x,y) %(x,y)


R(x,y) = SEL[x@y or x%y] R1(x,y) =

{ <x,y> ! R1(x,y) and (x@y or x%y) }

R(x,y) <- R1(x,y) @(x,y)
R(x,y) <- R1(x,y) %(x,y)


R(x) = SEL[not(x@c)] R1(x) =

{ x ! R1(x) and not x@c }

R(x) <- R1(x) -@(x,c)


The  above  five  operations  are  sufficient  to  define  a 
basis  for  the  relational  algebra  [10],  and  it  is  clear  that 
the  translation  of  any  algebraic  expression  into  a  set  of 
SYLLOG  rules  can  be  constructed  in  a  straightforward  way 
using the indicated translations for individual operations.
It  follows  easily  from  Theorem  1  that  the  constructed  SYLLOG 
rules  will  yield  the  same  result  as  the  relational  algebra 
expression.  This  completes  the  proof  of  Theorem  2.  [] 
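
As an informal illustration of the translations used in the proof, the following Python sketch evaluates one rule (or pair of rules) for each of the five basis operations. Relations are modelled as sets of tuples; the sample relations `R1` and `R2` and the chosen predicates are assumptions made for the example and do not come from the paper.

```python
# Relations are modelled as Python sets of tuples (illustrative data only).
R1 = {("a", 1), ("b", 2)}
R2 = {("b", 2), ("c", 3)}

# 1. Union: two rules, R(x_k) <- R_1(x_k) and R(x_k) <- R_2(x_k)
union = {t for t in R1} | {t for t in R2}

# 2. Set difference: R(x_k) <- R_1(x_k) -R_2(x_k)
difference = {t for t in R1 if t not in R2}

# 3. Cartesian product: R(x_k, x_l) <- R_1(x_k) R_2(x_l)
product = {t1 + t2 for t1 in R1 for t2 in R2}

# 4. Project onto the first column: R(x) <- R_1(x, y)
project = {(x,) for (x, y) in R1}

# 5. Select with one comparison: R(x, y) <- R_1(x, y) >(y, 1)
select = {(x, y) for (x, y) in R1 if y > 1}

# A disjunctive select becomes two rules with the same head.
select_or = ({(x, y) for (x, y) in R1 if y > 1}
             | {(x, y) for (x, y) in R1 if x == "a"})
```

Each comprehension plays the role of a rule body: positive literals become generators, and negated literals or comparisons become filters.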

We now give examples of how to write the algebra
operators natural join, intersection, and quotient as
SYLLOG rules (and hence as syllogisms).


The natural join

R = R_1 * R_2 = { <x,y,z> | R_1(x,y) and R_2(x,z) }

is written as the SYLLOG rule

R(x,y,z) <- R_1(x,y) R_2(x,z)

The intersection

R = R_1 & R_2 = { x_k | R_1(x_k) and R_2(x_k) }

is written

R(x_k) <- R_1(x_k) R_2(x_k)

The quotient

T = R/S

which is defined by

{ x | y in S implies <x,y> in R }

yields the set of rules

T(x) <- R_1(x) -R_2(x)

R_1(x) <- R(x,y)

R_2(x) <- R_1(x) S(y) -R(x,y)

Thus,  not  only  is  non-recursive  SYLLOG  formally  as  powerful 
as  the  relational  algebra,  but  all  of  the  algebra  operators 
except  division  have  simple  transliterations  in  SYLLOG. 
Since  the  quotient  operator  is  not  widely  used,  its  indirect 
expression  in  SYLLOG  is  a  minor  disadvantage. 
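
The three quotient rules can be checked against the direct definition with a small Python sketch. The function names and the supplier/part sample data are hypothetical, and the sketch assumes that R is binary, that S is unary, and that candidate values of x are drawn from the first column of R.

```python
def quotient_by_rules(R, S):
    """Evaluate T = R/S using the three rules given for the quotient."""
    # R_1(x) <- R(x,y)
    R1 = {x for (x, y) in R}
    # R_2(x) <- R_1(x) S(y) -R(x,y)
    R2 = {x for x in R1 for y in S if (x, y) not in R}
    # T(x) <- R_1(x) -R_2(x)
    return R1 - R2


def quotient_by_definition(R, S):
    """{ x | y in S implies <x,y> in R }, with x taken from R's first column."""
    candidates = {x for (x, y) in R}
    return {x for x in candidates if all((x, y) in R for y in S)}


# Hypothetical data: which supplier supplies which part.
R = {("s1", "p1"), ("s1", "p2"), ("s2", "p1")}
S = {"p1", "p2"}
result = quotient_by_rules(R, S)
```

Here `R2` collects the values of x that miss at least one y in S, so removing them from `R1` leaves exactly the values paired with every member of S.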

4.  BACKCHAIN-ITERATION and RECURSIVE SYLLOGISMS

In section 2.5 we used a syllogism

can go by train from _village to _Hoboken
can go by train from _Hoboken to _Newark

can go by train from _village to _Newark

Stated as a rule, this is

C(x,z) <- C(x,y) C(y,z)

and  it  is  clearly  recursive.  In  fact,  the  rule  expresses  the 
transitive  closure  of  the  asserted  tuples  in  C,  an  operation 
which  cannot  be  expressed  in  the  relational  algebra  [1]. 
Thus,  if  we  allow  such  rules,  SYLLOG  is  strictly  more 
powerful  than  the  relational  algebra.  This  section  describes 
a  technique  for  extending  the  definitions  of  section  3.2  to 
cover  the  recursive  case. 

4.1  An  Example 

Suppose  we  have  a  knowledge  base  containing  the  rule 
shown  above,  and  we  relax  our  constraint  that  a  predicate 
should  not  be  both  the  subject  of  an  assertion  and  the  left 
side  of  a  rule.  Then  we  can  also  assert  that  some  tuples  are 
in  C.  Suppose  we  assert  that  the  tuples 

a  b 

b  c 

c  d 

d  e 

e  f 

are  in  C,  so  that  the  knowledge  base  contains  the  assertions 
C(a,b)<-, ..., C(e,f)<-, together with the rule mentioned
above.

If  we  now  ask  the  question 

can  go  by  train  from  a  to  f 


SYLLOG will proceed to backchain from C(a,f). Rather than
producing an infinite tree, the backchain produces
the program tree

                  C(a,f)

        C(a,y)              C(y,f)

   C(a,u)    C(u,y)    C(y,v)    C(v,f)



In growing the tree downwards, the stopping criterion is to
not add to the tree a rule which is isomorphic (in a sense
to be defined in section 4.2) to a rule above it.


Next, we do the iteration part of backchain-iteration,
as described in section 3.2. Writing the extensions next to
the tree gives

                  C(a,f)
                    {}

        C(a,y)              C(y,f)
         {ac}                {df}

   C(a,u)    C(u,y)    C(y,v)    C(v,f)
  {ab,..}   {bc,..}   {de,..}   {ef,..}


where  {}  denotes  the  empty  set.  Now  suppose  that,  each  time 
a  tuple  is  placed  in  the  extension  of  a  node,  it  is  also 
placed  in  the  set  of  tuples  of  C.  Then,  after  the  first 
iteration,  C  contains  the  extra  rows 


a  c 
d  f 

It  is  easy  to  see  that,  if  we  now  repeat  the  computation  of 
the  extension,  the  root  extension  will  be  made  to  contain 
the  tuple  a  f,  yielding  the  required  positive  answer. 

So,  a  method  of  dealing  with  a  recursive  rule  is  to 
compute  the  extension  of  a  finite  tree  not  once,  as  in  the 
non-recursive  case,  but  repeatedly,  until  it  is  noted  that 
no  change  occurs  in  the  extension  of  any  node.  Clearly, 
there  are  ways  of  refining  this  to  make  it  more  efficient, 
but  the  principle  is  straightforward. 
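
The repeated computation can be sketched in Python for the train example. This is a simplified model, not the SETL prototype: the function names are hypothetical, and it assumes that the extension of each node holds the stored tuples matching that node together with the tuples derived by its rule.

```python
def scan_train_tree(C, start, goal):
    """One bottom-to-top scan of the program tree for the query C(start, goal)."""
    # Leaves read the stored relation C, restricted by any constant argument.
    c_a_u = {(x, y) for (x, y) in C if x == start}   # C(a,u)
    c_v_f = {(x, y) for (x, y) in C if y == goal}    # C(v,f)
    # C(a,y) <- C(a,u) C(u,y), plus stored tuples that match C(a,y)
    c_a_y = c_a_u | {(x, y) for (x, u) in c_a_u for (u2, y) in C if u == u2}
    # C(y,f) <- C(y,v) C(v,f), plus stored tuples that match C(y,f)
    c_y_f = c_v_f | {(y, z) for (y, v) in C for (v2, z) in c_v_f if v == v2}
    # Root: C(a,f) <- C(a,y) C(y,f), plus a stored C(a,f) if there is one
    root = {(x, z) for (x, y) in c_a_y for (y2, z) in c_y_f if y == y2}
    root |= {(x, y) for (x, y) in C if x == start and y == goal}
    return c_a_y, c_y_f, root


def can_go(asserted, start, goal):
    """Repeat the scan until no node extension changes; also count the scans."""
    C = set(asserted)
    previous, scans = None, 0
    while True:
        extensions = scan_train_tree(C, start, goal)
        scans += 1
        if extensions == previous:
            return bool(extensions[2]), scans
        previous = extensions
        # Every tuple placed in a node extension is also placed in C.
        for ext in extensions:
            C |= ext


asserted = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")}
answer, scans = can_go(asserted, "a", "f")
```

On the asserted tuples above, the first scan leaves the root empty and adds only a c and d f to C, and the second scan places a f in the root. Only tuples reachable from the end nodes are ever derived, which is the two-sided search behaviour of the tree.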

Note  that,  in  this  example,  backchain-iteration 
instantiates  to  a  program  which  is  a  two-sided  graph  search 
with  specified  end  nodes.  Thus  if  the  graph  of  the  relation 
C  contains  subgraphs  which  are  not  connected  to  the  end 
nodes  a  or  f,  then  the  corresponding  subrelations  are 
not searched.

4.2  Definition of Finite Backchaining for Recursion

In our example, we mentioned that backchaining is
halted when we are about to add to the program tree a rule
which would be isomorphic to one above it. Our working
definition of isomorphic is as follows:

Let Rule 1

A(x_0) <- B_1(y_1)..B_n(y_n) -C_1(z_1)..-C_N(z_N)

be part of the tree, and let Rule 2, which we are deciding
whether or not to add, be

A'(x_0') <- B_1'(y_1')..B_n'(y_n') -C_1'(z_1')..-C_N'(z_N')

If A≠A', B_i≠B_i', or C_j≠C_j' for some i and j
(i.e. if the rules cannot be arranged, preserving negation,
to have the same predicate names in the same left to right
order), then the rules are not isomorphic. If the rules do
have the same sequence of predicate names, but there is no
substitution s such that s(x_0')=x_0, s(y_i')=y_i, and
s(z_j')=z_j for all i and j, then the rules are still not
isomorphic. If there is such an s, let

c = CARD{ x | s(x)=x, x is a constant }

and

v = CARD{ <x,b> | s(x)=b, x is a variable
          and b is a constant },

where CARD denotes the cardinality of a set, and say that
the rules are isomorphic unless c is greater than or equal
to 1 and v is greater than or equal to 1.

To see how this definition works in halting the
backchain in the example in section 4.1, suppose we are at a
stage when the program tree contains only

C(a,f) <- C(a,y) C(y,f)  (Rule 1)

and that we are considering whether or not to add some
instance of

C(x,z) <- C(x,u) C(u,z)

below C(a,y). Clearly, the required instance is

C(a,z) <- C(a,u) C(u,z)  (Rule 2).

Rules 1 and 2 become identical under the substitution
s(a)=a, s(z)=f, s(u)=y. This substitution has c greater than
or equal to 1 by reason of s(a)=a and v greater than or equal
to 1 by reason of s(z)=f, so the two rules are not isomorphic,
and Rule 2 is added to the tree.

Next,  suppose  that  Rule  2  is  in  the  tree,  and  that  we 
are  considering  whether  to  add 

C(a,v)  <-  C(a,w)  C(w,v)  (Rule  3) 

below the C(a,u) in Rule 2. Rule 3 and Rule 2 become
identical under the substitution s(a)=a, s(v)=z, s(w)=u. For
this s, c is greater than or equal to 1, but v is less than 1
(no variable is mapped to a constant), so Rules
2 and 3 are isomorphic, and Rule 3 is not added to the
program tree.
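
The working definition can be sketched as a Python function. The encoding is an assumption made for illustration: a rule is a list of literals `(negated, predicate, arguments)` with the head first, the set of constants is passed explicitly, the literals are taken to be already arranged in matching order, and a substitution is assumed to leave constants fixed.

```python
def isomorphic(new_rule, tree_rule, constants):
    """Return True if new_rule is isomorphic to tree_rule (so it is not added)."""
    if len(new_rule) != len(tree_rule):
        return False
    s = {}  # substitution from terms of new_rule to terms of tree_rule
    for (neg2, pred2, args2), (neg1, pred1, args1) in zip(new_rule, tree_rule):
        if (neg2, pred2) != (neg1, pred1) or len(args2) != len(args1):
            return False              # predicate names or negation differ
        for term2, term1 in zip(args2, args1):
            if term2 in constants and term2 != term1:
                return False          # assumption: constants map to themselves
            if s.setdefault(term2, term1) != term1:
                return False          # no consistent substitution s exists
    c = sum(1 for x, b in s.items() if x in constants and x == b)
    v = sum(1 for x, b in s.items() if x not in constants and b in constants)
    return not (c >= 1 and v >= 1)


constants = {"a", "f"}
rule1 = [(False, "C", ("a", "f")), (False, "C", ("a", "y")), (False, "C", ("y", "f"))]
rule2 = [(False, "C", ("a", "z")), (False, "C", ("a", "u")), (False, "C", ("u", "z"))]
rule3 = [(False, "C", ("a", "v")), (False, "C", ("a", "w")), (False, "C", ("w", "v"))]

add_rule2 = not isomorphic(rule2, rule1, constants)
add_rule3 = not isomorphic(rule3, rule2, constants)
```

With the rules of the example, Rule 2 is accepted below Rule 1 and Rule 3 is rejected below Rule 2, in agreement with the discussion above.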

4.3  Iteration for Recursion

In the present, experimental, version of SYLLOG, the
program tree extension iteration is repeated until no
extension is changed in a full bottom to top scan of the
tree. This is wasteful in the absence of recursion, since
twice as much computation may be done as is needed; a first
scan is made to get the answer, and a second scan is made to
check that it is indeed the whole answer. Theorem 1 assures
us that, if we detect, at backchain time, that there is no
recursion, then a single extension scan is sufficient. On
the other hand, if recursion is present, it is easy to adapt
the example in section 4.1 to show that we cannot limit the
number of extension scans in advance.


The repeated scan of the whole tree, each rule at each
level being executed once at each scan, can actually be
incorrect if both recursion and negation are present
(although there is a simple way of making it correct). To see
this, consider the rules

T(x,z) <- R(x,z) -S(x,z)

S(x,z) <- S(x,y) S(y,z)

together  with  the  data  R(a,d),  S(a,b),  S(b,c),  S(c,d).  The 
correct  answer  to  the  query  T(x,z)  is  EMPTY,  but  the  first 
scan  of  the  program  tree 


            T(x,z)

     R(x,z)       -S(x,z)

              S(x,y)   S(y,z)

will place the tuple a d in the extension of the root, and
subsequent scans will not remove it. One way of correcting
this is to repeatedly extend the S subtree before extending
the root of the tree.
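
The difficulty and the suggested correction can be reproduced with a short Python sketch on the data of this example. The helper name `close` and the variable names are hypothetical, and the sketch models only the single rule for S and the single rule for T.

```python
def close(S):
    """Repeatedly apply S(x,z) <- S(x,y) S(y,z) until S stops growing."""
    S = set(S)
    while True:
        new = {(x, z) for (x, y) in S for (y2, z) in S if y == y2} - S
        if not new:
            return S
        S |= new


R = {("a", "d")}
S = {("a", "b"), ("b", "c"), ("c", "d")}

# Whole-tree scan: the root is extended after only one pass over the S subtree.
S_once = S | {(x, z) for (x, y) in S for (y2, z) in S if y == y2}
T_naive = {t for t in R if t not in S_once}

# Correction: extend the S subtree to a fixed point before extending the root.
T_correct = {t for t in R if t not in close(S)}
```

After one pass S holds a c and b d but not yet a d, so the naive root extension wrongly contains a d. Once S is closed first, the negated literal sees a d and the answer is empty.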

It  can  also  be  necessary,  in  some  cases,  to  repeatedly 
scan  a  local  subtree.  For  example,  if  the  rules  are 

A(x,z) <- B(x,y) A(y,z)

B(x,z) <- A(x,y) B(y,z)

and the data are A(a,b), A(c,d), B(b,c), B(d,e), then the
tree

              B(x,z)

       A(x,y)        B(y,z)

  B(x,u)   A(u,y)



must  be  scanned  twice  to  determine  that  B(a,e).  If  this  tree 
is  a  subtree,  and  its  root  is  the  subject  of  negation  higher 
in  the  main  tree,  then  the  local  repeated  scan  must  be  made 
before  scanning  higher. 
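
A small Python sketch shows why one scan is not enough here. It is a simplified model with a hypothetical function name: one scan evaluates the lower A node and then the root B node, and each derived tuple is added to the corresponding relation.

```python
def scan_ab_tree(A, B):
    """One bottom-to-top scan: the A node first, then the root B node."""
    # A(x,z) <- B(x,y) A(y,z)
    A = A | {(x, z) for (x, y) in B for (y2, z) in A if y == y2}
    # B(x,z) <- A(x,y) B(y,z)
    B = B | {(x, z) for (x, y) in A for (y2, z) in B if y == y2}
    return A, B


A0 = {("a", "b"), ("c", "d")}
B0 = {("b", "c"), ("d", "e")}

A1, B1 = scan_ab_tree(A0, B0)   # first scan
A2, B2 = scan_ab_tree(A1, B1)   # second scan
```

The first scan derives A(b,d) and then B(b,e), but B(a,e) needs B(b,e) as a stored tuple and so appears only in the second scan.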

Since  it  is  not  easy  to  find  real  examples  of  data  base 
retrievals  which  require  recursion  beyond  the  simple  form 


R(x,z) <- R(x,y) R(y,z)

needed  for  transitive  closure,  a  good  compromise  between 
generality  and  computational  cost  appears  to  be  to  reject 
backchain  trees  which  contain  more  complicated  recursions, 
and  to  execute  the  admissible  ones  by  repeated  local 
scanning only.

CONCLUSIONS


The  SYLLOG  system,  which  has  been  prototyped  in  SETL, 
provides  a  simple,  English-like  language  in  which  a 
non-programmer  can  set  up  and  use  a  data  base.  The  language 
prompts  the  user  by  showing  a  set  of  standardized  English 
sentences  on  the  screen,  and  the  user  makes  a  command  by 
choosing  one  of  these  sentences  and  modifying  it.  The 
language  is  designed  for  interactive  use  at  a  screen,  and 
would  be  most  suitable  for  use  with  a  light  pen  plus 
occasional  key  strokes.  So  far,  the  language  has  been 
implemented  using  a  line  editor,  and  separately  using  a 
visual editor. The number of key strokes needed is quite
small.

The  standardized  English-like  sentences  are  grouped 
into  syllogisms,  which  function  as  a  way  of  encoding 
knowledge about a particular domain (e.g. travel, dentistry,
etc.) for use in query processing. A query is a single
sentence,  and  it  triggers  a  search  of  the  domain  knowledge, 
followed  by  a  search  of  a  relational  data  base.  Thus  the 
domain  knowledge  mediates  between  the  user  and  the  data 
base. 

The  order  in  which  syllogisms  are  made  known  to  SYLLOG 
has  no  effect  on  the  result  of  a  query,  so  that  the  language 
may  fairly  be  said  to  be  non-procedural.  Recursive 
syllogisms  are  allowed,  hence  the  power  of  SYLLOG  exceeds 
that  of  the  relational  algebra  (and  of  the  relational 
calculus),  yet  there  is  no  possibility  for  an  inexperienced 
user  to  take  the  system  into  an  infinite  loop. 

The  two  preprocessing  stages  for  a  SYLLOG  query,  namely 
the  translation  from  English  to  predicate  form  followed  by 
the  construction  of  a  program  tree,  are  reasonably 
straightforward,  and  require  little  space  or  time  in  the 
computer.  Once  a  program  tree  has  been  constructed,  the 
computer  resources  needed  for  its  execution  are  similar  to 
those  needed  for  any  relational  data  base  system. 

We  note  that  a  method  for  converting  recursive  rules  in 
a  data  base  intension  into  iterative  programs  over  the 
extension  has  also  been  suggested  in  [2].  In  that  method, 
relations  are  partitioned  into  those  which  are  derived  and 
those  which  are  asserted,  whereas  we  find  it  useful  to  mark 
individual  tuples  as  either  derived  or  asserted.  Also,  in 
[2],  a  rule  must  be  regular,  in  the  sense  that  the  premise 
may  contain  at  most  one  derived  relation;  hence  either  the 
user  must  be  restricted  to  only  declare  regular  rules,  or  a 
general  method  must  be  found  to  convert  irregular  rules  into 
regular  ones.  As  stated  in  [2],  "finding  a  good  program  from 
a  recursive  query  (graph)  is  a  fruitful  area  of  research". 
Our  finite  backchain  algorithm  appears  to  be  a  step  in  this 
direction.

Although the knowledge contained in a set of syllogisms
greatly  simplifies  matters  for  the  user,  this  paper  has  only 
described  the  use  of  the  knowledge  for  query  processing. 
Syllogisms  can  also  be  used  for  type-checking,  and  to 
express  constraints  which  can  be  automatically  enforced 
whenever  an  update  is  made.  The  use  of  syllogisms  to  express 
constraints  is  discussed  in  [11].  The  related  matter  of 
updates  into  syllogistically  defined  views  of  a  data  base 
remains as an interesting topic for future work.